Advancements in the Time-Frequency Approach to Multichannel Blind Source Separation
Abstract
The ability of the human cognitive system to distinguish between multiple, simultaneously active sources of sound is a remarkable quality that is often taken for granted. This capability has been studied extensively within the speech processing community, and many attempts have been made to imitate it. However, automatic speech processing systems have yet to perform at a level akin to human proficiency (Lippmann, 1997) and are thus frequently faced with the quintessential "cocktail party problem": inadequate processing of the target speaker(s) when there are multiple speakers in the scene (Cherry, 1953). Implementing a source separation algorithm can improve the performance of such systems. Source separation is the recovery of the original sources from a set of observations; if no a priori information about the original sources and/or the mixing process is available, it is termed blind source separation (BSS). Rather than rely on a priori information about the acoustic scene, BSS methods often make an assumption about the constituent source signals and/or exploit the spatial diversity obtained through a microphone array. BSS has many important applications in both the audio and biosignal disciplines, including medical imaging and communication systems.
Similar resources
Blind Source Separation Using Mixtures of Alpha-Stable Distributions
We propose a new blind source separation algorithm based on mixtures of alpha-stable distributions. Complex symmetric alpha-stable distributions have recently been shown to model audio signals in the time-frequency domain better than classical Gaussian distributions, thanks to their larger dynamic range. However, inference of these models is notoriously hard to perform because their probability...
Härmä and Faller Spatial Decomposition
Techniques where a stereo or a multichannel signal is decomposed into spatial source-labeled time-frequency slots by level, time-difference, and coherence metrics have become popular in recent years. Good examples are binaural cue coding and up/downmixing techniques. In the article, we will provide an overview and discuss parallel approaches in the field of array processing and blind source sep...
A unified approach for underdetermined blind signal separation and source activity detection by multichannel factorial hidden Markov models
This paper proposes to introduce a new model called “the multichannel factorial hidden Markov Model (MFHMM)” for underdetermined blind signal separation (BSS). For monaural source separation, one successful approach involves applying nonnegative matrix factorization (NMF) to the magnitude spectrogram of a mixture signal, interpreted as a non-negative matrix. Up to now, multichannel extensions o...
Flexible multichannel blind deconvolution, an investigation
In this paper, we consider the issue of devising a flexible nonlinear function for multichannel blind deconvolution. In particular, we consider the underlying assumption about the source probability density functions. We consider two cases, in which the source probability density functions are assumed to be unimodal and multimodal, respectively. In the unimodal case, there are two approaches: Pearson...
Joint Sound Source Separation and Speaker Recognition
Non-negative Matrix Factorization (NMF) has already been applied to learn speaker characterizations from single or nonsimultaneous speech for speaker recognition applications. It is also known for its good performance in (blind) source separation for simultaneous speech. This paper explains how NMF can be used to jointly solve the two problems in a multichannel speaker recognizer for simultaneo...